Patent abstract:
An encoding mechanism encodes a video stream to provide optimum quality for a given bit rate. The encoding engine divides the video sequence into a collection of scene sequences. Each scene sequence includes video frames captured from a particular capture point. The encoding engine resamples each scene sequence over a range of different resolutions, encodes each resampled sequence with a range of quality parameters, and then oversamples each encoded sequence to the original resolution of the video sequence. For each oversampled sequence, the encoding engine computes a quality metric and generates a data point that includes the quality metric and the resampling resolution. The encoding mechanism collects all such data points and then computes the convex hull of the resulting data set. Based on all convex hulls across all scene sequences, the encoding mechanism determines an ideal collection of scene sequences for a range of bit rates.
Publication number: BR112020000998A2
Application number: R112020000998-9
Filing date: 2018-07-16
Publication date: 2020-07-14
Inventor: Ioannis Katsavounidis
Applicant: Netflix, Inc.
IPC main classification:
Patent description:

[001] This application claims the priority benefit of the United States Provisional Patent Application entitled “ENCODING TECHNIQUE FOR OPTIMIZING DISTORTION AND BITRATE”, filed on July 18, 2017 and having Serial Number 62/534,170, and of the United States Patent Application entitled “ENCODING TECHNIQUE FOR OPTIMIZING DISTORTION AND BITRATE”, filed on July 12, 2018 and having Serial Number 16/034,303. The subject matter of these related applications is incorporated into this document by reference.
[002] Embodiments of the present invention relate generally to video encoding and, more specifically, to encoding techniques for optimizing distortion and bit rate.
[003] A video streaming service provides access to a library of media titles that can be played on a number of different endpoint devices. Each endpoint device can connect to the video streaming service under different connection conditions, including available bandwidth and latency, among others. In addition, each different device may include different hardware to deliver the video content to the end user. For example, a given endpoint device may include a display screen having a particular screen size and a particular screen resolution.
[004] Typically, an endpoint device that connects to a streaming video service runs an endpoint application that determines, for a given media title in the video content library, an appropriate version of the media title to stream to the endpoint device. Each different version of a given media title is usually encoded using a different bit rate, and different versions of the media title have resolutions, scale factors and/or other parameters typically associated with video content that differ from one another. During playback of the media title on the endpoint device, the endpoint application selects the appropriate version of the media title to stream to the endpoint device based on factors such as network conditions, connection quality information and hardware specifications of the endpoint device.
[005] As noted above, to prepare a media title for streaming, the media title is encoded using multiple different bit rates. To that end, an encoding application performs individual “monolithic” encodings of the complete media title, using a different set of encoding parameters for each encoding. Each different encoding can be associated with a different quality metric that objectively indicates the level of distortion introduced into that encoded version of the media title by the encoding process. The quality metric associated with a given encoding typically depends on the encoding parameters used to generate that encoding. For example, an encoding generated with a high bit rate compared to another encoding may have a higher quality metric than that other encoding.
[006] Encoding a media title with different encoding parameters typically requires different computational resources and different storage resources. For example, generating an encoding with a high bit rate and high quality metric generally consumes more computational/storage resources than generating an encoding with a low bit rate and low quality metric. A conventional encoding application can select a given set of encoding parameters to generate a single monolithic encoding in order to satisfy a particular target quality metric for that encoding.
[007] However, a problem with this approach is that not all parts of a media title require the same encoding parameters to satisfy a given target quality metric, yet conventional encoding applications use the same encoding parameters for the full media title. As a result, a conventionally encoded media title can consume excessive computing and storage resources to satisfy the target quality metric, even though some parts of the media title do not need those resources to satisfy the same metric. This inefficiency unnecessarily wastes computing resources and storage resources.
[008] As indicated above, what is needed in the art is a more efficient technique for encoding video sequences.
SUMMARY OF THE INVENTION
[009] One embodiment of the present invention sets forth a computer-implemented method that includes generating a first set of encoded blocks for a source video sequence, generating a first set of data points based on the first set of encoded blocks, performing one or more convex hull operations on the first set of data points to compute a first subset of data points that are optimized across at least two metrics, computing a first slope value between a first data point included in the first subset of data points and a second data point included in the first subset of data points, and determining, based on the first slope value, that a first encoded block associated with the first data point should be included in a final encoded version of the source video sequence.
[010] At least one technological advantage of the disclosed techniques relative to the prior art is that performing optimization operations at the granularity of encoded blocks reduces the encoding inefficiencies associated with conventional encoding techniques. As a result, the final encoded version of the source video sequence can be streamed to endpoint devices with increased visual quality for a target bit rate. Conversely, the final encoded version of the source video sequence can be streamed to endpoint devices with a reduced bit rate for a target visual quality.
BRIEF DESCRIPTION OF THE DRAWINGS
[011] In order that the previously recited features of the present invention can be understood in detail, a more particular description of the invention, summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, since the invention may admit other equally effective embodiments.
[012] Figure 1A illustrates a cloud computing environment configured to implement one or more aspects of the present invention; figure 1B is a more detailed illustration of the encoding mechanisms of figure 1A, according to various embodiments of the present invention; figure 2 illustrates how the encoding mechanisms of figure 1B divide a video sequence into scene sequences, according to various embodiments of the present invention; figure 3 illustrates how the encoding mechanisms of figure 1B process the scene sequences of figure 2 to generate a data set, according to various embodiments of the present invention;
[013] In the following description, numerous specific details are set out to provide a more complete understanding of the present invention. However, it will be clear to those skilled in the art that the present invention can be practiced without one or more of these specific details. In other instances, well-known features are not described in order to avoid obscuring the present invention.
[014] As discussed earlier, conventional encoding techniques suffer from specific inefficiencies associated with performing “monolithic” encoding of video sequences. These inefficiencies arise because conventional encoding techniques encode all parts of a video sequence with the same encoding parameters to satisfy a given quality metric, despite the fact that some parts of the video sequence can be encoded with different encoding parameters and still satisfy the same quality metric.
[015] To address this issue, embodiments of the present invention include an encoding mechanism configured to encode different scene sequences within a source video sequence with different encoding parameters that optimize bit rate for a given level of distortion. When encoding a scene sequence, the encoding mechanism resamples the scene sequence to a range of different resolutions and then encodes each resampled sequence using a range of quality parameters. The encoding mechanism then oversamples each encoded sequence to the original resolution of the source video sequence and computes a quality metric for the resulting oversampled sequences. Based on the oversampled sequences and corresponding quality metrics for each scene sequence, the encoding mechanism generates different encoded versions of the source video sequence. Each such version is composed of multiple scene sequences encoded with potentially different encoding parameters.
[016] An advantage of this approach is that parts of the source video sequence that need specific encoding parameters to satisfy a given quality metric are encoded with precisely those parameters. In addition, other parts of the source video sequence can be encoded with other appropriately chosen encoding parameters. Therefore, encoded versions of the source video sequence are generated in a more efficient way.
[017] Figure 1A illustrates a cloud computing environment configured to implement one or more aspects of the present invention. As shown, a system 100 includes a host computer 110 coupled to a computer cloud 130. Host computer 110 includes a processor 112, input/output (I/O) devices 114 and a memory 116 coupled together.
[018] Processor 112 can be any technically feasible form of processing device configured to process data and execute program code. Processor 112 can be, for example, a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), any technically feasible combination of such units, and so on.
[019] Input/output devices 114 may include devices configured to receive input, including, for example, a keyboard, a mouse, and so on. Input/output devices 114 may also include devices configured to provide output, including, for example, a display device, a speaker, and so on. Input/output devices 114 may additionally include devices configured to both receive input and provide output, including, for example, a touch screen, a universal serial bus (USB) port, and so on.
[020] Memory 116 may include any technically feasible storage medium configured to store data and software applications. Memory 116 can be, for example, a hard disk, a random access memory (RAM) module, a read-only memory (ROM), and so on. Memory 116 includes a host encoding mechanism 118 and a database 120.
[021] Host encoding mechanism 118 is a software application that, when executed by processor 112, performs an encoding operation on media content stored in database 120 and/or in an external storage resource. Host encoding mechanism 118 is configured to operate in conjunction with the various cloud encoding mechanisms discussed in more detail below.
[022] Computer cloud 130 includes a plurality of cloud computers 140 (0) to 140 (N). Any cloud computer 140 can be a physically separate computing device or a virtualized instance of a computing device. Each cloud computer 140 includes a processor 142, input/output devices 144 and a memory 146, coupled together. A given processor 142 can be any technically feasible form of processing device configured to process data and execute program code, including a CPU, a GPU, an ASIC, an FPGA, any technically feasible combination of such units, and so on. A given set of input/output devices 144 can include devices configured to receive input, including, for example, a keyboard, a mouse, and so on, similar to the input/output devices 114 discussed earlier. Each memory 146 is a storage medium configured to store data and software applications, including a cloud encoding mechanism 148 and a database.
[023] The cloud encoding mechanisms 148 (0) to 148 (N) are configured to operate in conjunction with the host encoding mechanism 118 in order to perform various parts of an encoding operation. In general, the host encoding mechanism 118 coordinates the operation of the cloud encoding mechanisms 148 (0) to 148 (N), and can perform tasks such as distributing processing tasks to those mechanisms, collecting processed data from each mechanism, and so on. People familiar with cloud computing will understand that the cloud encoding mechanisms 148 (0) to 148 (N) can operate substantially in parallel with one another. Therefore, the host encoding mechanism 118 can efficiently perform complex encoding tasks by configuring the cloud encoding mechanisms 148 to perform separate tasks simultaneously. As a general matter, the host encoding mechanism 118 and the cloud encoding mechanisms 148 represent different modules within a distributed software entity, as described in more detail below in combination with figure 1B.
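The host/cloud fan-out described above can be sketched as a plain task-distribution loop. This is a minimal illustration using Python's standard thread pool, not the patent's actual implementation; `encode_fn` is a hypothetical per-task encode routine supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor

def distribute_encoding_tasks(tasks, encode_fn, max_workers=4):
    """Fan encoding tasks out to parallel workers and collect results
    in task order, mirroring how a host engine coordinates a pool of
    cloud encoding engines that run substantially in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the input ordering of the tasks
        return list(pool.map(encode_fn, tasks))
```

For example, `distribute_encoding_tasks(scene_list, encode_scene)` would return one result per scene, with the per-scene work running concurrently.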
[024] Figure 1B is a more detailed illustration of the encoding mechanisms of figure 1A, according to various embodiments of the present invention. As shown, an encoding mechanism 160 includes the host encoding mechanism 118 and the cloud encoding mechanisms 148 (0) to 148 (N). As a general matter, encoding mechanism 160 constitutes a distributed software entity configured to perform one or more different encoding operations via execution of the host encoding mechanism 118 and the cloud encoding mechanisms 148. In particular, the encoding mechanism 160 processes a source video sequence 170 to generate a set of encoded video sequences 180. The source video sequence 170 is a media title that can be included in a content library associated with a video streaming service. Each encoded video sequence 180 is a different version of that media title encoded with different (and potentially varying) encoding parameters.
[025] To perform the encoding operation, the encoding mechanism 160 preprocesses the source video sequence 170 to remove extraneous pixels and then splits the source video sequence 170 into a plurality of scene sequences. Each scene sequence includes frames captured continuously from a given camera or capture point. This procedure is discussed in combination with figure 2. The encoding mechanism 160 then resamples each scene sequence at one or more different resolutions, and processes all resampled sequences to generate a data set. The resampling process is discussed in combination with figure 3. Generation of the data set based on the resampled sequences is discussed in combination with figure 4. The encoding mechanism 160 then generates, based on the data set, a convex hull of data points that minimize bit rate for a given level of distortion, as discussed in combination with figures 5A-5B and figure 6. Based on all convex hull points across all scene sequences, the encoding mechanism 160 generates the set of encoded video sequences 180. These encoded video sequences optimize distortion and bit rate, as discussed in combination with figures 7-9. The encoding operation discussed in combination with figures 3-9 is also presented as a series of steps in combination with figures 10-11.
[026] Figure 2 illustrates how the encoding mechanisms of figure 1B divide a video sequence into scene sequences, according to various embodiments of the present invention. As mentioned previously in combination with figures 1A-1B, the encoding mechanism 160 is configured to perform an encoding operation to generate different encoded versions of the source video sequence 170, where each different version minimizes distortion for a given bit rate and/or optimizes distortion and bit rate. A first step in the encoding operation is illustrated in figure 2. As shown, a scene analyzer 200 is configured to process the source video sequence 170 to generate scene sequences 220 (0) to 220 (P). The scene analyzer 200 is a software module included in the encoding mechanism 160.
[027] The scene analyzer 200 generates each scene sequence 220 to have the same resolution as the source video sequence 170. However, each scene sequence 220 includes a different sequence of video frames that corresponds to a different “scene”. In the context of this disclosure, a “scene” can be a sequence of frames captured continuously from a single camera or virtual representation of a camera (for example, in the case of computer-animated video sequences). When generating the scene sequences 220, the scene analyzer 200 can also remove extraneous pixels from the source video sequence 170. For example, the scene analyzer 200 can remove pixels included in black bars along edge sections of the source video sequence 170.
[028] The scene analyzer 200 can determine which frames of the source video sequence 170 correspond to each different scene using many different techniques. For example, the scene analyzer 200 can identify a set of sequential frames having a continuous distribution of pixel values that does not change significantly across a subset of two or more sequential frames. Alternatively, the scene analyzer 200 can compare characteristics present in each frame and identify sequential frames having similar characteristics. Those skilled in the art will understand that there are many techniques for parsing a source video sequence into separate scene sequences. Having parsed the source video sequence 170 in this manner, the encoding mechanism 160 processes each scene sequence 220 to generate a different data set, as described below in combination with figure 3.
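As a rough illustration of the first technique above, a scene splitter can cut wherever the mean absolute pixel difference between consecutive frames spikes. This is a minimal pixel-difference heuristic, not the patent's actual algorithm, and the frame representation (a flat list of pixel values) and threshold are illustrative assumptions.

```python
def split_into_scenes(frames, threshold=30.0):
    """Split a list of frames (each a flat list of pixel values, all the
    same length) into scene sequences by detecting abrupt changes in the
    mean absolute pixel difference between consecutive frames."""
    if not frames:
        return []
    scenes = [[frames[0]]]
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            scenes.append([cur])       # scene cut detected: start a new scene
        else:
            scenes[-1].append(cur)     # same scene continues
    return scenes
```

A dark-to-bright cut, for instance, produces a large mean difference and therefore starts a new scene sequence.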
[029] Figure 3 illustrates how the encoding mechanisms of figure 1B process the scene sequences of figure 2 to generate a data set, according to various embodiments of the present invention. As shown, a resampler 300 processes a scene sequence 220 to generate resampled sequences 320 (0) to 320 (M). Each resampled sequence 320 has a different resolution, as shown. The resampled sequence 320 (0) has a resolution of 4,096 x 2,048, the resampled sequence 320 (1) has a resolution of 2,048 x 1,024, and the resampled sequence 320 (M) has a resolution of 256 x 144. The set of resampled sequences 320 corresponds to a resolution scale 330 that is associated with the scene sequence 220.
[030] The resampler 300 can generate the resolution scale 330 to include any distribution of resolutions. In practice, however, the resampler 300 first generates the resampled sequence 320 (0) to have the same resolution as the scene sequence 220 (or the source video sequence 170), and then generates each subsequent resampled sequence 320 (1) onwards to have a resolution that is a constant fraction of the previous resolution. In practice, the ratio between the resolution of a given resampled sequence 320 (H) and the previous resampled sequence 320 (H-1) is approximately 1.5.
[031] However, in various embodiments a denser resolution scale can be used, that is, one with a ratio between the resolution of a given resampled sequence 320 (H) and the previous resampled sequence 320 (H-1) of less than 1.5, such as 1.414 or 1.26, or a coarser resolution scale, that is, one with a ratio between the resolution of a given resampled sequence 320 (H) and the previous resampled sequence 320 (H-1) of more than 1.5, such as 2.0 or 3.0. The density of the resolution scale 330 may also depend on the characteristics of the video scene, such that it uniformly covers the desired quality levels. Additional constraints, such as the amount of CPU time to be spent encoding a given sequence, can be used to decide the density of the resolution scale.
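A resolution scale built around such a constant ratio can be sketched as follows. The rounding-to-even rule and the minimum height are illustrative assumptions (most codecs require even dimensions), not details taken from the patent text.

```python
def build_resolution_scale(width, height, ratio=1.5, min_height=144):
    """Generate a descending resolution scale starting at the source
    resolution; each subsequent rung divides the previous dimensions
    by `ratio`, rounded to even integers for codec compatibility."""
    scale = []
    w, h = float(width), float(height)
    while h >= min_height:
        # round each dimension to the nearest even integer
        scale.append((2 * round(w / 2), 2 * round(h / 2)))
        w, h = w / ratio, h / ratio
    return scale
```

Calling `build_resolution_scale(4096, 2048)` yields a ladder whose first rung matches the source resolution and whose rungs shrink by roughly 1.5x per step; passing `ratio=1.26` or `ratio=2.0` produces the denser or coarser scales discussed above.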
[032] Having generated the resolution scale 330, the encoding mechanism 160 then executes a set of parallel processing threads 340 to process each different resampled sequence 320. Each processing thread 340 generates, based on the resampled sequence 320 provided to it, a collection of data points 350. Processing thread 340 (0) generates data points 350 (0), processing thread 340 (1) generates data points 350 (1), and so on for all processing threads 340.
[033] Figure 4 is a more detailed illustration of the processing thread of figure 3, according to various embodiments of the present invention. As shown, a processing thread 340 receives a resampled sequence 320 and generates, through a set of parallel sub-threads 450 (0) to 450 (L), data points 350. Each sub-thread 450 includes an encoder 400, a decoder 410, an oversampler 420 and a metric analyzer 430. Sub-thread 450 (0) includes encoder 400 (0), decoder 410 (0), oversampler 420 (0) and metric analyzer 430 (0), sub-thread 450 (1) includes encoder 400 (1), decoder 410 (1), oversampler 420 (1) and metric analyzer 430 (1), and so on for all sub-threads 450. The encoders 400 and decoders 410 within each sub-thread 450 can implement any technically feasible encoding/decoding algorithm(s), including advanced video coding (AVC), high-efficiency video coding (HEVC) or VP9, among others.
[034] During execution of processing thread 340, each encoder 400 (0) to 400 (L) first encodes the resampled sequence 320 with a different quantization parameter (QP). Encoder 400 (0) encodes the resampled sequence 320 with QP = 0, encoder 400 (1) encodes the resampled sequence 320 with QP = 1, and encoder 400 (L) encodes the resampled sequence 320 with QP = L. In general, the number L of encoders corresponds to the number of QPs available for the given algorithm implemented by the encoders 400. In embodiments where the encoders 400 implement the AVC encoding algorithm using the x264 implementation, the encoders 400 can perform the described encoding operation using different constant rate factors (CRFs) instead of QPs. In various embodiments, the encoders 400 can vary any encoding parameter in addition to QP or CRF.
[035] Importantly, the encoded resampled sequences generated by the encoders 400 can ultimately be included in the encoded video sequences 180 shown in figure 1B. In the context of this disclosure, these encoded resampled sequences may be referred to in this document as “blocks”. A “block” generally includes a sequence of video frames encoded with a particular set of encoding parameters. In practice, each block is resampled with a particular resolution and then encoded with a given QP. Also, each block is generally derived from a given scene sequence. However, those skilled in the art will understand that a “block” in the context of video encoding can represent a variety of different constructs, including a group of pictures (GOP), a sequence of frames, a plurality of frame sequences, and so on.
[036] Although the encoders 400 encode the resampled sequences 320 with different QPs in the described manner, each sub-thread 450 otherwise proceeds in a relatively similar manner. The decoders 410 receive the encoded sequences and then decode those sequences. Therefore, each video sequence produced via the oversamplers 420 (0) to 420 (L) has the same resolution. However, these video sequences can have different qualities because they were encoded with different QPs.
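The oversampling step can be illustrated with a nearest-neighbor upsampler that restores a decoded frame to a larger resolution. Production pipelines use higher-quality filters (bicubic, Lanczos, and the like); nearest-neighbor is an assumption made only to keep the sketch short, and the row-major flat-list frame format is likewise illustrative.

```python
def upsample_nearest(frame, src_w, src_h, dst_w, dst_h):
    """Nearest-neighbor upsample of a row-major frame (flat list of
    pixel values) from src_w x src_h to dst_w x dst_h, so that decoded
    sequences of different resolutions can be compared at a common
    (original) resolution."""
    out = []
    for y in range(dst_h):
        sy = y * src_h // dst_h                 # source row for this output row
        row = frame[sy * src_w:(sy + 1) * src_w]
        out.extend(row[x * src_w // dst_w] for x in range(dst_w))
    return out
```

After this step every sub-thread's output has the same frame size, so a single quality metric can be computed against the original sequence.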
[037] In one embodiment, the oversamplers 420 oversample the decoded sequences to target resolutions that may be relevant to the display characteristics of a class of endpoint devices. For example, a certain video can be delivered at a resolution of 3,840 x 2,160, and also be expected to be consumed by a large number of displays at a resolution of 1,920 x 1,080.
[038] The metric analyzer 430 analyzes the oversampled sequences to generate an objective quality metric (QM) for each sequence. The metric analyzer 430 can implement, for example, a video multimethod assessment fusion (VMAF) algorithm.
[039] Each metric analyzer 430 then generates a different data point 440 that includes the resolution of the resampled sequence 320, the QP implemented by the respective encoder 400 and the computed QM. Thus, for each different QP, processing thread 340 generates a separate data point, shown as data points 440 (0) to 440 (L). Importantly, each data point 440 corresponds to a particular resampled/encoded version of a given scene sequence 220. As described in more detail below, the encoding mechanism 160 selects resampled/encoded versions of each scene sequence 220 for inclusion in the encoded video sequences 180 based on the associated data points 440. The processing thread 340 gathers all such data points 440 into data points 350, as also shown in figure 3.
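The per-scene fan-out over resolutions and QPs that produces these data points can be sketched as follows. Here `encode` and `measure_quality` are hypothetical caller-supplied hooks standing in for a real codec and a real quality metric such as VMAF; they are not part of the patent text.

```python
from collections import namedtuple

# One data point per (resolution, QP) combination, as in the text:
# resolution, quantization parameter, measured bit rate, quality metric.
DataPoint = namedtuple("DataPoint", ["resolution", "qp", "bitrate", "quality"])

def generate_data_points(scene, resolutions, qps, encode, measure_quality):
    """Fan one scene sequence out over every (resolution, QP)
    combination and record one DataPoint per combination.
    `encode(scene, resolution, qp)` returns (encoded, bitrate);
    `measure_quality(scene, encoded)` returns a quality score."""
    points = []
    for resolution in resolutions:
        for qp in qps:
            encoded, bitrate = encode(scene, resolution, qp)
            quality = measure_quality(scene, encoded)
            points.append(DataPoint(resolution, qp, bitrate, quality))
    return points
```

With M resolutions and L QPs this yields the M * L data points per scene described below.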
[040] Referring again to figure 3, the encoding mechanism 160 generates a different set of data points 350 (0) to 350 (M) for each different resampled sequence 320 (0) to 320 (M), and then gathers these data points 350 into the data set 360. Therefore, the data set 360 includes M * L data points, because the encoding mechanism 160 generates a data point in the data set 360 for each combination of the M different resampled sequences 320 and the L different QPs. The encoding mechanism 160 need not use the same number of QPs or the same QP values for each resolution; it can instead use a fully customized number of QPs and QP values appropriate for each scene. The encoding mechanism 160 then performs a processing operation, discussed below in combination with figures 5A-5B, to identify the particular data points within the data set 360 that minimize distortion and/or bit rate.
[041] Figure 5A is a graph of bit rate versus quality that is generated based on the data set of figure 3, according to various embodiments of the present invention. As shown, a graph 500 includes a bit rate axis 510 and a quality metric (QM) axis 520. Graph 500 also includes quality curves 502, 504 and 506 plotted against the bit rate axis 510 and the quality metric axis 520. Each curve shown corresponds to a different resolution encoding for a particular scene sequence 220 and for this reason can be derived from a particular set of data points 350, where each data point 440 in a given set corresponds to a particular combination of resolution, QP and QM. The encoding mechanism 160 generates the data points included in curves 502, 504 and 506 by converting the resolution of each data point 440 to a given bit rate. The encoding mechanism 160, for example, can divide the total number of bits required for the given resolution by the length of the associated scene sequence 220.
[042] The encoding mechanism 160 is configured to reprocess the data set 360 plotted in figure 5A to replace the QM with a distortion metric. The encoding mechanism 160 can compute a given distortion metric by inverting a QM value, subtracting the QM value from a constant value, or performing other known techniques to convert quality into distortion. The encoding mechanism 160 then generates a convex hull based on the converted values, as discussed below in combination with figure 5B.
[043] Figure 5B is a graph of convex hull data points that is generated based on the data set of figure 3, according to various embodiments of the present invention. As shown, graph 550 includes a bit rate axis 560 and a distortion axis 570. The encoding mechanism 160 plots distortion curves 552, 554 and 556 with respect to the bit rate axis 560 and the distortion axis 570. Then, the encoding mechanism 160 computes the convex hull points 580 by identifying points across all curves that form a boundary where all the points reside on one side of the boundary (in this case, the right side of the boundary) and that are also such that connecting any two consecutive points on the convex hull with a straight line leaves all the remaining points on the same side. In this manner, the encoding mechanism 160 can generate convex hull points 580 for each scene sequence 220. Those skilled in the art will understand that many techniques for generating convex hulls are well known in the field of mathematics, and all such techniques can be implemented to generate the convex hull points 580.
[044] Figure 6 illustrates how the encoding mechanisms of figure 1B generate the convex hull data points of figure 5B, according to various embodiments of the present invention. As shown, a distortion converter 600 and a convex hull analyzer 620 cooperatively process the data set 360 to generate the convex hull points 580. In operation, the distortion converter 600 receives the data set 360 and converts the QM values included in this data set into distortion values. Then, the convex hull analyzer 620 computes the convex hull for the data set 360 to generate the convex hull points 580.
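One standard way to compute such a hull is a monotone-chain pass over (bit rate, distortion) pairs; this is a sketch of one of the many well-known techniques the text alludes to, not the patent's specific implementation.

```python
def _cross(o, a, b):
    """2D cross product of vectors OA and OB (positive = left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull_points(points):
    """Compute the lower-left convex frontier of (bitrate, distortion)
    points: no other point offers both a lower bit rate and a lower
    distortion, and the frontier is convex, so the magnitude of its
    slope (distortion decrease per extra bit) shrinks along it."""
    pts = sorted(set(points))                      # ascending bit rate
    hull = []
    for p in pts:
        # drop points that would make the boundary non-convex
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    # keep only the part of the lower hull where distortion still falls
    return [p for i, p in enumerate(hull) if i == 0 or p[1] < hull[i - 1][1]]
```

Points lying above the frontier (dominated encodings) are discarded, leaving exactly one best encoding per useful bit rate range.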
[045] In this manner, the encoding mechanism 160 computes the convex hull points 580 for each scene sequence 220 based on the associated data set 360. Thus, the encoding mechanism 160 generates P sets of convex hull points 580 based on the P different scene sequences 220. Again, each set of convex hull points 580 includes data points that describe, for a scene sequence, the distortion and bit rate for a particular resampled encoded version of the scene sequence. This version is resampled with a given resolution and encoded with a given QP. The encoding mechanism 160 collects all the convex hull points 580 generated for all P scene sequences 220 and then performs additional processing to generate the encoded video sequences 180, as described in more detail below in combination with figure 7.
[046] Figure 7 illustrates how the encoding mechanisms of figure 1B generate different versions of the video sequence of figure 2 using a plurality of convex hulls, according to various embodiments of the present invention. As shown, a grid iterator 700 receives the convex hull points 580 (0) to 580 (P) and then iteratively updates a sequence grid 710 to generate sequence RD points 720. The grid iterator 700 is a software module included in the encoding mechanism 160. The sequence grid 710 is a data structure that is described in more detail below in combination with figures 8A-8D. The sequence RD points 720 include bit rate-distortion (RD) points.
[047] Each sequence RD point 720 corresponds to a different encoded video sequence 180. Each encoded video sequence 180 includes a different combination of the resampled encoded scene sequences discussed above. A streaming application 730 is configured to stream the encoded video sequences 180 to an endpoint device based on the sequence RD points 720. Each encoded video sequence 180 minimizes distortion (on average) across all scene sequences in the video sequence for a given average bit rate associated with the video sequence, as also discussed in more detail below in combination with figure 9. The grid iterator 700 generates these different sequences using a technique described in more detail below.
[048] Figures 8A-8D illustrate in more detail how the encoding mechanisms of figure 1B assemble blocks of video content into an encoded video sequence, according to various embodiments of the present invention. As shown in figures 8A-8D, the sequence grid 710 includes a scene axis 800 and a bit rate axis 810. The sequence grid 710 also includes columns of the convex hull points 580, where each column corresponds to a particular scene sequence. For example, the column number zero included in the sequence grid 710 corresponds to the convex hull points 580 (0). The convex hull points within any column are ranked in order of ascending bit rate (and, by construction, descending distortion). The convex hull points are also guaranteed to have negative slopes that, in magnitude, are decreasing as a function of bit rate.
[049] [049] For convenience, the 580 convex hull points are indexed individually according to the following system. For a given point, the first number is an index of the scene sequence, and the second number is an index in the bit rate classification of these hollow points. For example, the convex hull point 00 corresponds to the sequence of scenes of number zero and the bit rate classified with number zero (in this case the lowest bit rate). Similarly, convex hull point 43 corresponds to the fourth sequence of scenes and the bit rate classified as third (in this case the highest rated bit rate).
[050] Each convex hull point included in the sequence grid 710 corresponds to a different resampled encoded version of a scene sequence 220, as described. The encoding mechanism 160 generates the encoded video sequences 180 shown in figure 1B by combining these resampled encoded versions of the scene sequences 220.
[051] Each of figures 8A-8D illustrates a different version of the sequence grid 710 generated by the grid iterator 700 in a different iteration. Figure 8A illustrates the sequence grid 710 (0) in an initial state. Here, the grid iterator 700 generates the sequence 820 (0) of convex hull points that includes the hull points 00, 10, 20, 30 and 40. These initially selected hull points have the lowest bit rate encodings and the highest distortions, and for this reason reside at the bottom of their respective columns. Based on the sequence 820 (0), the grid iterator 700 generates an encoded video sequence 180 that includes the resampled encoded scene sequences 220 associated with each of the convex hull points 00, 10, 20, 30 and 40. The grid iterator 700 also generates the sequence RD point 720 (0) based on that encoded video sequence 180.
[052] The grid iterator 700 then computes, for each convex hull point within the sequence 820 (0), the rate of change of distortion with respect to bit rate between the convex hull point and the neighboring convex hull point above it. For example, the grid iterator 700 can compute the rate of change of distortion with respect to bit rate between nodes 00 and 01, 10 and 11, 20 and 21, 30 and 31, and 40 and 41. The rate of change computed for the convex hull point associated with a given resampled encoded scene sequence 220 represents the derivative of the distortion curve associated with that scene sequence, taken at the convex hull point.
[053] The grid iterator 700 selects the derivative having the greatest magnitude, and then selects the neighbor above associated with that derivative for inclusion in a subsequent sequence 820. For example, in figure 8B, the grid iterator 700 determines that the derivative associated with the convex hull point 30 is the largest, and for this reason includes the convex hull point 31 (the neighbor above the convex hull point 30) in the sequence 820 (1). Based on the sequence 820 (1), the grid iterator 700 generates an encoded video sequence 180 that includes the resampled encoded scene sequences 220 associated with each of the convex hull points 00, 10, 20, 31 and 40. The grid iterator 700 then generates the sequence RD point 720 (1) based on that encoded video sequence 180. The grid iterator 700 performs this technique iteratively, thereby ascending the sequence grid 710, as shown in figures 8C-8D.
[054] In figure 8C, the grid iterator 700 determines that the derivative associated with the convex hull point 10 is the largest when compared to the other derivatives, and then selects the convex hull point 11 for inclusion in the sequence 820 (2). Based on the sequence 820 (2), the grid iterator 700 generates an encoded video sequence 180 that includes the resampled encoded scene sequences 220 associated with each of the convex hull points 00, 11, 20, 31 and 40. The grid iterator 700 also generates the sequence RD point 720 (2) based on that encoded video sequence 180. The grid iterator 700 continues this process until it generates the sequence 820 (T) associated with the sequence grid iteration 710 (T), as shown in figure 8D. In this manner, the grid iterator 700 incrementally improves the sequences 820 by selecting a single convex hull point for which bit rate is increased and distortion is decreased, thereby generating a collection of encoded video sequences 180 with increasing bit rate and decreasing distortion.
[055] In one embodiment, the grid iterator 700 adds convex hull points before ascending the sequence grid 710 in order to create a termination condition. Specifically, the grid iterator 700 can duplicate the convex hull point having the highest bit rate in each column, causing the rate of change between the second-to-last and the last convex hull point to be zero. When this zero rate of change is detected for all scenes, that is, when the maximum magnitude of the rate of change is exactly zero, the grid iterator 700 identifies the termination condition and stops iterating.
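The ascent described in paragraphs [051]-[055] can be sketched in a few lines of Python. This is an illustrative reading of the algorithm, not the patent's implementation: each scene's convex hull is assumed to be a list of (bitrate, distortion) pairs sorted by ascending bit rate, and the function and variable names are invented for the sketch.

```python
def trellis_ascent(hulls):
    """Yield (total_bitrate, total_distortion, selection) for each sequence 820."""
    # Duplicate each column's top point so the slope there is zero (termination).
    hulls = [h + [h[-1]] for h in hulls]
    sel = [0] * len(hulls)  # start with the lowest-bitrate point per scene
    while True:
        total_rate = sum(hulls[s][i][0] for s, i in enumerate(sel))
        total_dist = sum(hulls[s][i][1] for s, i in enumerate(sel))
        yield total_rate, total_dist, tuple(sel)
        # Slope magnitude between each current point and its neighbor above.
        best_scene, best_slope = None, 0.0
        for s, i in enumerate(sel):
            if i + 1 >= len(hulls[s]):
                continue
            (r0, d0), (r1, d1) = hulls[s][i], hulls[s][i + 1]
            if r1 == r0:          # duplicated top point: zero rate of change
                continue
            slope = (d0 - d1) / (r1 - r0)   # distortion drop per extra bit
            if slope > best_slope:
                best_scene, best_slope = s, slope
        if best_scene is None:    # every scene is at its top point: stop
            return
        sel[best_scene] += 1      # replace one point with its neighbor above
```

Each yielded selection corresponds to one encoded video sequence 180; successive selections differ in exactly one scene, mirroring figures 8A-8D.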
[056] Referring again to figure 7, the grid iterator 700 generates the encoded video sequences 180 that correspond to the sequences 820 shown in figures 8A-8D using the grid technique described above. Because the grid iterator 700 generates the sequences 820 in ascending fashion to reduce distortion and increase bit rate, the encoded video sequences 180 span a range from high distortion and low bit rate to low distortion and high bit rate. Each sequence RD point 720 provides the distortion and bit rate for a given encoded video sequence 180, and these sequence RD points 720 can be plotted to generate a convex hull, as discussed below in combination with figure 9.
[057] Figure 9 is a graph of convex hull data points generated for the different versions of the video sequence shown in figures 8A-8D, according to various embodiments of the present invention. As shown, a graph 900 includes a bit rate axis 910 and a distortion axis 920. A curve 930 is plotted against the bit rate axis 910 and the distortion axis 920. The curve 930 can be generated based on the collection of sequence RD points 720 corresponding to the encoded video sequences 180 generated by means of the grid technique discussed previously in combination with figures 8A-8D. Accordingly, the curve 930 represents distortion as a function of bit rate across all of the encoded video sequences 180.
[058] Based on the curve 930, the streaming application 730 of figure 7 is able to select, for a given available bit rate, the particular encoded video sequence 180 that minimizes distortion for that bit rate. The streaming application 730 can select a single encoded video sequence 180 during streaming, or can dynamically switch between encoded video sequences 180. For example, the streaming application 730 can switch between encoded video sequences 180 at scene boundaries. With this approach, the streaming application 730 can deliver a consistent-quality video experience to the end user without requiring excessive bandwidth.
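As a small illustration of this selection step, the following sketch picks, from a list of sequence RD points given as (bitrate, distortion) pairs, the index of the encoded video sequence 180 with the lowest distortion that does not exceed the currently available bit rate. The function name and the fallback behavior when nothing fits are assumptions of the sketch.

```python
def select_sequence(rd_points, available_bitrate):
    """Index of the minimum-distortion sequence within the bitrate budget."""
    feasible = [(d, i) for i, (r, d) in enumerate(rd_points) if r <= available_bitrate]
    if not feasible:
        # Nothing fits: fall back to the lowest-bitrate encoding (assumption).
        return min(range(len(rd_points)), key=lambda i: rd_points[i][0])
    return min(feasible)[1]   # lowest distortion wins; ties break on index
```

Re-running this selection at scene boundaries as the measured bandwidth changes gives the dynamic switching behavior described above.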
[059] The encoding mechanism 160 can implement variations of the technique described above in order to reduce storage and computational complexity. In one embodiment, the encoding mechanism 160 implements a "restricted" version of the approach described above.
[060] For example, the encoding mechanism 160 can select a subset of the possible values for the encoding parameter based on one or more criteria related to efficiency. For example, the encoding mechanism 160 can select every second QP value, every third QP value, or, more generally, every Nth QP value (where N is an integer ranging from 2 to 10, inclusive) within a smaller range of QP values. In some alternative embodiments, statistics can be collected from multiple encodings of various scene sequences to determine the statistical probability that different QP values will be selected by the encoding mechanism 160.
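As a concrete (purely illustrative) reading of the restricted approach, keeping every Nth QP value within a narrowed range can be expressed as:

```python
def restricted_qps(qp_min, qp_max, n):
    """Every nth QP value in [qp_min, qp_max]; n would typically be 2 to 10."""
    return list(range(qp_min, qp_max + 1, n))
```

For instance, sweeping QP 20 through 40 in steps of 3 requires seven encodings per resampled sequence instead of twenty-one; the specific range and step here are invented for the example.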
[061] In another embodiment, the encoding mechanism 160 implements an "iterative" version of the approach described above, whereby the encoding mechanism 160 performs multiple encoding passes to determine an encoding having a target bit rate or distortion level. Initially, the encoding mechanism 160 can perform a first pass using a restricted range of QP values, such as the one discussed above in combination with the "restricted" approach. Once the encoding mechanism 160 has generated a convex hull of sequence RD points, such as that shown in figure 9, the encoding mechanism 160 then identifies the sequence RD point closest to the target bit rate or target distortion level, selects one or more nearby points on the convex hull and, based on the range of QPs associated with those points, performs additional encodings. The encoding mechanism 160 can iteratively refine the range of QPs used for encoding to target a particular bit rate or distortion level.
[062] In yet another embodiment, the encoding mechanism 160 implements a "fixed quality" version of the approach described above and limits the number of scene encodings that need to be stored and processed subsequently. With this approach, the encoding mechanism 160 can produce scene encodings at predetermined, well-spaced quality intervals. The encoding mechanism 160 can then assemble these scene encodings into complete encoded video sequences 180 having a fixed quality across the entire sequence. The number of scene encodings implemented per scene sequence is a configurable parameter that represents a trade-off between quality and storage needs. In performing this technique, the encoding mechanism 160 processes the convex hull points 580 and then iteratively removes extraneous points until the remaining points represent the desired number of scene encodings. For example, the encoding mechanism 160 can iteratively remove the convex hull points 580 having the smallest gap relative to the adjacent convex hull points 580. This technique allows the encoding mechanism 160 to maximize the minimum quality of the scene encodings.
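One way to read this pruning step is the following sketch, which repeatedly removes the interior point whose quality is closest to an adjacent point's until the desired count remains. Representing each scene encoding by a single quality value and always keeping the two endpoint qualities are assumptions of the sketch, not details stated by the text.

```python
def prune_to_count(qualities, target_count):
    """Thin a list of quality values down to target_count well-spaced values."""
    points = sorted(qualities)
    while len(points) > target_count:
        # Interior point with the smallest gap to either adjacent point.
        gap, idx = min((min(points[i] - points[i - 1], points[i + 1] - points[i]), i)
                       for i in range(1, len(points) - 1))
        del points[idx]   # drop the most redundant encoding
    return points
```

Because the point removed at each step is the one contributing least spacing, the surviving encodings remain spread across the quality range, which is what maximizes the minimum gap between stored quality levels.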
[063] In other embodiments, the encoding mechanism 160 implements a "minimum-maximum optimization" version of the approach described above. In such an implementation, the encoding mechanism 160 selects a convex hull point for inclusion in a subsequent sequence 820 based on distortion metrics or quality metrics instead of derivative values. In particular, the encoding mechanism 160 determines the convex hull point included in the sequence 820 (x) that has the maximum distortion metric (or the minimum quality metric) and then includes the neighbor above that convex hull point in the subsequent sequence 820 (x + 1).
[064] In related embodiments, when ascending the sequence grid 710 the encoding mechanism 160 can trade off changes in slope between the convex hull points 580 against actual quality values. Thus, before selecting a convex hull point 580 for inclusion in a subsequent sequence, the encoding mechanism 160 can filter out scene sequences (and the corresponding convex hull points 580) having a quality metric below a given threshold (or a distortion metric above a given threshold). Only after restricting the available scene sequences and convex hull points in this manner does the encoding mechanism 160 generate a subsequent encoded video sequence 180 based on comparing the slope values of the remaining convex hull points 580. This approach can maximize both average and minimum quality.
[065] With any of the approaches discussed so far, the encoding mechanism 160 can be configured to enforce specific constraints that limit encoding behavior. For example, the encoding mechanism 160 can be configured to limit the distortion of encoded scene sequences to always be below a maximum tolerable distortion level. However, adjustments to the encoding mechanism 160 may be necessary to allow compliance with more complex constraints. One example of a complex constraint is the video buffering verifier (VBV) constraint, which is known to those skilled in the art. The VBV constraint generally dictates that data must arrive at a relatively constant bit rate and must be stored in a buffer having a relatively constant size. This constraint helps to avoid buffer overflow and/or underflow, among other potential problems. More specific formulations of the VBV constraint are also known to those skilled in the art, including the constant bit rate (CBR) VBV constraint and the variable bit rate (VBR) VBV constraint, although discussion of these specific versions is omitted for brevity.
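A highly simplified, illustrative model of such a constraint check follows: bits arrive at a constant rate into a fixed-size buffer, and each frame must be fully buffered before it is consumed. This is only a sketch of the general idea, not the formal CBR or VBR VBV definitions, and every parameter name is invented.

```python
def vbv_feasible(frame_bits, fill_rate, buffer_size, initial_fullness):
    """Return True if no frame underflows the modeled decoder buffer."""
    fullness = min(initial_fullness, buffer_size)
    for bits in frame_bits:
        if bits > fullness:        # underflow: the frame's bits arrive too late
            return False
        fullness -= bits           # the frame is consumed from the buffer
        fullness = min(fullness + fill_rate, buffer_size)  # refill, clamped at the top
    return True
```

During grid ascent, a check of this kind could be run against each candidate sequence so that only constraint-compliant convex hull points 580 are eligible for selection.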
[066] In one embodiment, the encoding mechanism 160 can be configured to perform the grid ascent discussed previously in combination with figures 8A-8D in a manner that allows the final encoded video sequences 180 to comply with arbitrarily complex sets of constraints, including the VBV constraint discussed above. In doing so, the encoding mechanism 160 analyzes not only the slope values between neighboring hull points 580 when selecting a new hull point for inclusion in a subsequent sequence, but also the compliance of each possible subsequent sequence with one or more constraints (for example, CBR VBV, VBR VBV, and so on). Thus, for each convex hull point 580 that can potentially be included in a subsequent sequence, the encoding mechanism 160 determines the degree to which that sequence complies with the constraints. The encoding mechanism 160 then selects the convex hull points 580 that allow subsequent sequences to maintain compliance. This form of grid ascent constitutes a "dynamic programming" approach, and can also represent a form of Viterbi solution to the specific problem of optimizing bit rate versus distortion.
[067] In alternative embodiments, the encoding mechanism 160 and the streaming application 730 may cause encoded video sequences 180 to be delivered to endpoint devices in any technically feasible manner. In the same or other embodiments, any amount and type of functionality associated with the encoding mechanism 160 and the streaming application 730 may be implemented on, or distributed across, any number of host computers 110, any number of cloud computers 140, any number of client computers (not shown), and any number of endpoint devices, in any technically feasible manner.
[068] For example, in some embodiments, the encoding mechanism 160 configures the streaming application 730 to deliver metadata to client applications running on endpoint devices. The metadata includes, without limitation, metrics associated with encoded video content at any level of granularity, such as bit rates and quality metrics associated with one or more encoded scene sequences and/or encoded video sequences 180. Client applications can perform any type and quantity of adaptive streaming operations based on the metadata in any technically feasible manner.
[069] In one scenario, a user sets up a video player application to stream a movie to a laptop. The streaming application 730 transmits the metadata associated with four different encoded video sequences 180 (1)-180 (4) to the video player application. The metadata indicates that the encoded video sequence 180 (4) is associated with the highest bit rate and the highest visual quality, while the encoded video sequence 180 (1) is associated with the lowest bit rate and the lowest visual quality. At any given time, the video player application selects the encoded video sequence 180 that provides the highest available visual quality during movie playback while avoiding playback interruptions due to rebuffering.
[070] Based on an initial available bandwidth and the metadata, the video player application configures the streaming application 730 to start streaming the encoded video sequence 180 (4) to the video player application. In this way, the video player application provides the highest available visual quality during movie playback. In general, due to Internet traffic, especially during peak hours of the day, connection conditions can change quickly and become quite variable. In the scenario described, after ten minutes of playback, the available bandwidth decreases significantly. Based on the reduced bandwidth and the metadata, the video player application configures the streaming application 730 to dynamically switch from the encoded video sequence 180 (4) to the encoded video sequence 180 (1). At the next scene boundary, the streaming application 730 starts streaming the encoded video sequence 180 (1) instead of the encoded video sequence 180 (4) to the video player application. Although the video player application is no longer able to provide the highest available visual quality during movie playback, the video player application successfully avoids playback interruptions due to rebuffering.
[071] Those skilled in the art will understand that the techniques described so far are equally applicable to audio in addition to video. For example, the objective quality metric discussed earlier can provide a measure of audio quality. The remaining portions of the techniques described above would otherwise proceed in a similar manner.
[072] Figure 10 is a flow chart of method steps for assembling blocks of video content into an encoded video sequence, according to various embodiments of the present invention. Although the method steps are described in combination with the systems of figures 1-9, those skilled in the art will understand that any system configured to perform the method steps, in any order, is within the scope of the present invention.
[073] As shown, a method 1000 starts at step 1002, where the encoding mechanism 160 receives the source video sequence 170. The source video sequence 170 includes a sequence of frames encoded at a native or "distribution" resolution. In step 1004, the encoding mechanism 160 processes the source video sequence 170 to remove superfluous pixels. Such pixels can reside in horizontal or vertical black bars adjacent to the actual content of the video sequence. In step 1006, the encoding mechanism 160 divides the source video sequence 170 into scene sequences 220.
[074] The method then proceeds to step 1008. In step 1008, for each scene sequence 220, the encoding mechanism 160 resamples the scene sequence M times to generate a resolution scale 330 of resampled sequences 320, as shown in figure 3. Each resampled sequence 320 has a different resolution. One resampled sequence 320 has the same resolution as the original video sequence.
[075] The method then proceeds to step 1010. For each resampled sequence 320 in the resolution scale 330, the encoding mechanism 160 processes the resampled sequence 320 through a processing thread 340 to generate data points 350. The specific processing steps performed by the processing thread 340 are described in more detail below in combination with figure 11. Each data point 350 indicates, for a given resampled sequence 320, the encoding resolution of the sequence, a quality metric for the sequence, and the QP value used to encode the sequence, as discussed in more detail below in combination with figure 11.
[076] In step 1012, the encoding mechanism 160 collects all data points 350 for all resampled sequences 320 in the resolution scale 330 to generate a data set 360. The data set 360 corresponds to a scene sequence 220. Each data point in the data set 360 corresponds to a different encoding and different resolution of the scene sequence. In step 1014, the encoding mechanism 160 converts the quality metric associated with these data points to a distortion metric, and then generates the convex hull points 580 for the data set, as shown in figure 5B. The convex hull points 580 minimize distortion or bit rate across all resampled/encoded scene sequences.
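Step 1014 can be illustrated with a standard monotone-chain scan that keeps only the lower-left boundary of the (bit rate, distortion) cloud — the points for which no lower-distortion encoding exists at a lower or equal bit rate and whose connecting slopes decrease in magnitude. This is a generic computational-geometry sketch, not the patent's exact procedure.

```python
def lower_convex_hull(points):
    """Keep the convex, strictly distortion-decreasing lower boundary."""
    pts = sorted(set(points))                 # ascending bit rate, then distortion
    hull = []
    for p in pts:
        # Pop points that would make the boundary non-convex.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    # Drop points where distortion stops decreasing (dominated encodings).
    return [p for k, p in enumerate(hull) if k == 0 or p[1] < hull[k - 1][1]]

def _cross(o, a, b):
    """Cross product of vectors o->a and o->b (turn direction)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
```

Running this per scene sequence yields the convex hull points 580 fed into the sequence grid 710.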
[077] In step 1016, the encoding mechanism 160 collects all convex hull points 580 across all scene sequences to generate a sequence grid 710. The construction of an exemplary sequence grid 710 was discussed in detail in combination with figures 8A-8D. In step 1018, the encoding mechanism 160 iteratively ascends the sequence grid to generate a collection of the encoded video sequences 180 and the corresponding sequence RD points 720. One approach to ascending the sequence grid 710 is discussed in combination with figure 12.
[078] In step 1020, the streaming application 730 selects an encoded video sequence 180 to stream based on the associated sequence RD point 720. In this way, the streaming application 730 can select a particular sequence RD point 720 that minimizes distortion for a given available bit rate, and then transmit the encoded video sequence 180 associated with that sequence RD point 720 to an endpoint device.
[079] Figure 11 is a flow chart of method steps for processing a resampled sequence of scenes to generate a set of data points, according to various embodiments of the present invention. Although the method steps are described in combination with the systems of figures 1-9, those skilled in the art will understand that any system configured to perform the method steps, in any order, is within the scope of the present invention.
[080] The encoding mechanism 160 implements a method 1100 to perform the processing associated with a given sub-thread 450 within a processing thread 340. The encoding mechanism 160 can execute multiple sub-threads 450 in parallel to implement a given processing thread 340, and thus can perform the method 1100 multiple times.
[081] As shown, the method 1100 begins at step 1102, where the encoding mechanism 160 encodes a resampled sequence 320 with a selected quantization parameter (QP). In step 1104, the encoding mechanism 160 then decodes the encoded sequence and, in step 1106, oversamples the decoded sequence to the resolution associated with the source video sequence 170. In step 1108, the encoding mechanism 160 generates one or more quality metrics (QMs) for the oversampled sequence. In step 1110, the encoding mechanism 160 generates a data point 440 that includes the resolution of the resampled sequence, the choice of quantization parameter (QP), and the quality metric (QM) generated for the encoded resampled video sequence.
[082] Figure 12 is a flow chart of method steps for generating a set of encoded video sequences, according to various embodiments of the present invention. Although the method steps are described in combination with the systems of figures 1-9, those skilled in the art will understand that any system configured to perform the method steps, in any order, is within the scope of the present invention.
[083] As shown, a method 1200 starts at step 1202, where the encoding mechanism 160 generates a sequence grid 710 based on the convex hull points 580 for all scene sequences 220. The sequence grid 710, as discussed previously in combination with figures 8A-8D, includes individual columns of convex hull points 580, where each column corresponds to a particular scene sequence. Accordingly, an encoded version of the source video sequence 170 can be constructed by collecting one resampled encoded scene sequence 220 from each such column.
[084] In step 1204, the encoding mechanism 160 determines a sequence of convex hull points 580 having the lowest bit rate. In step 1206, the encoding mechanism 160 designates the determined sequence as the "current sequence". In step 1208, the encoding mechanism 160 generates an encoded video sequence based on the current sequence. To do so, the encoding mechanism 160 collects each resampled encoded scene sequence 220 associated with the sequence of convex hull points 580 to construct an encoded version of the source video sequence 170. In step 1210, the encoding mechanism 160 generates a sequence RD point 720 based on that encoded video sequence.
[085] In step 1212, the encoding mechanism 160 computes the magnitude of the slope between each convex hull point in the current sequence and the neighboring convex hull point above it. The "neighbor above" a given convex hull point resides immediately above that convex hull point and in the same column. In step 1214, the encoding mechanism 160 identifies the convex hull point and the neighboring convex hull point above having the greatest slope magnitude relative to each other. In step 1216, the encoding mechanism 160 generates a new sequence of convex hull points in which the identified convex hull point is replaced by the neighboring convex hull point above it. Finally, in step 1218, the encoding mechanism 160 designates the new sequence as the "current sequence" and returns to step 1208. The encoding mechanism 160 can repeat the method 1200 until it generates an encoded video sequence 180 having the maximum bit rate when compared to the other sequences, or until another termination condition is satisfied.
[086] In this manner, the encoding mechanism 160 "ascends" the sequence grid 710 by determining subsequent versions of the current sequence that maximally reduce distortion relative to the increase in bit rate when compared to other possible versions. When ascending the sequence grid 710 in this manner, the encoding mechanism 160 does not need to consider all possible combinations of all resampled encoded scene sequences (also referred to in this document as "blocks"). Accordingly, the encoding mechanism 160 can conserve considerable computing resources while still determining a spectrum of encoded video sequences that optimize distortion for a range of bit rates.
[087] In summary, an encoding mechanism encodes a video stream to provide optimum quality for a given bit rate. The encoding engine divides the video sequence into a collection of scene sequences. Each scene sequence includes video frames captured from a particular capture point. The encoding engine resamples each sequence of scenes over a range of different resolutions, encodes each resampled sequence with a range of quality parameters, and then oversamples each encoded sequence to the original resolution of the video sequence. For each oversampled sequence, the coding engine computes a quality metric and generates a data point that includes the quality metric and the resampling resolution. The encoding mechanism collects all such data points and then computes the convex hull of the resulting data set. Based on all convex hulls across all scene sequences, the encoding mechanism determines an ideal collection of scene sequences for a range of bit rates.
[088] At least one advantage of the techniques described in this document is that the video stream can be streamed to an end user with the best quality available for a given bit rate. Conversely, for a given desired quality, the video stream can be provided with the minimum possible bit rate.
[089] 1. Some embodiments of the invention include a computer-implemented method, comprising: generating a first set of encoded blocks for a source video sequence, generating a first set of data points based on the first set of encoded blocks, performing one or more convex hull operations across the first set of data points to compute a first subset of data points that are optimized across at least two metrics, computing a first slope value between a first data point included in the first subset of data points and a second data point included in the first subset of data points, and determining, based on the first slope value, that a first encoded block associated with the first data point should be included in a final encoded version of the source video sequence.
[090] 2. The computer-implemented method of clause 1, in which generating the first set of encoded blocks comprises: identifying within the source video sequence a first frame sequence that is associated with a first capture point, resampling the first frame sequence at a plurality of different resolutions to generate a resolution scale of resampled versions of the first frame sequence, and encoding each resampled version of the first frame sequence with a different encoding parameter to generate the first set of encoded blocks.
[091] 3. The computer-implemented method of any of clauses 1 and 2, in which generating the first set of data points comprises: decoding each encoded block in the first set of encoded blocks to generate a first set of decoded blocks, oversampling each decoded block in the first set of decoded blocks to a source resolution associated with the source video sequence to generate a first set of oversampled blocks, and generating a different data point for each oversampled block in the first set of oversampled blocks.
[092] 4. The computer-implemented method of any of clauses 1, 2 and 3, in which a specific data point in the first set of data points is generated by: generating a specific objective quality metric for a specific oversampled block in the first set of oversampled blocks, converting the specific objective quality metric to a specific distortion metric, computing a bit rate for the specific oversampled block, and combining the specific distortion metric and the bit rate to generate the specific data point.
[093] 5. The computer-implemented method of any of clauses 1, 2, 3 and 4, in which performing one or more convex hull operations across the first set of data points to compute the first subset of data points comprises: determining a first region that includes the first set of data points, identifying a first boundary of the first region, in which no data points in the first set of data points reside on a first side of the first boundary, and discarding all data points that do not reside along the first boundary, in which each data point that resides along the first boundary optimizes the first metric with respect to the second metric.
[094] 6. The computer-implemented method of any of clauses 1, 2, 3, 4 and 5, in which the first metric comprises distortion and the second metric comprises bit rate.
[095] 7. The computer-implemented method of any of clauses 1, 2, 3, 4, 5 and 6, further comprising: generating a second set of encoded blocks for the source video sequence, generating a second set of data points based on the second set of encoded blocks, performing one or more convex hull operations across the second set of data points to compute a second subset of data points that are optimized across at least two metrics, and computing a second slope value between a third data point included in the second subset of data points and a fourth data point included in the second subset of data points.
[096] 8. The computer-implemented method of any of clauses 1, 2, 3, 4, 5, 6 and 7, in which determining that the first encoded block associated with the first data point should be included in the final encoded version of the source video sequence comprises determining that the first slope value has a magnitude greater than that of the second slope value.
[097] 9. The computer-implemented method of any of clauses 1, 2, 3, 4, 5, 6, 7 and 8, further comprising determining that a second coded block associated with the fourth data point should be included in another encoded version of the source video sequence based on determining that the second slope value is greater than other slope values associated with other subsets of data points.
[098] 10. The computer-implemented method of any of clauses 1, 2, 3, 4, 5, 6, 7, 8 and 9, in which the first set of coded blocks is associated with a first sequence of video frames captured continuously from a first capture point, and a second set of encoded blocks is associated with a second sequence of video frames captured continuously from a second capture point.
[099] [099] 11. A computer readable non-transitory medium storing program instructions that, when executed by a processor, configure the processor to perform the steps of: generating a first set of encoded blocks for a source video sequence, generating a first set of data points based on the first set of coded blocks, perform one or more convex hull operations through the first set of data points to compute a first subset of data points that are optimized using at least two metrics, compute a first slope value between a first data point included in the first subset of data points and a second data point included in the first subset of data points, and determine, based on the first slope value, that a first block encoded associated with the first data point must be included in a final encoded version of the source video stream .
[0100] [0100] 12. The computer-readable non-transitory medium of clause 11, in which the step of generating the first set of encoded blocks comprises: identifying within the source video sequence a first frame sequence that is associated with a first capture point; resampling the first frame sequence at a plurality of different resolutions to generate a resolution ladder of resampled versions of the first frame sequence; and encoding each resampled version of the first frame sequence with a different encoding parameter to generate the first set of encoded blocks.
[0101] [0101] 13. The computer-readable non-transitory medium of any of clauses 11 and 12, in which the step of generating the first set of encoded blocks comprises: generating a plurality of values for an encoding parameter based on a plurality of possible values and on a maximum number of encoded blocks, where the total number of values included in the plurality of values is less than the total number of possible values included in the plurality of possible values; and encoding a plurality of resampled versions of a first frame sequence based on the plurality of values for the encoding parameter to generate the first set of encoded blocks.
[0102] [0102] 14. The computer-readable non-transitory medium of any of clauses 11, 12 and 13, in which the step of generating the first set of data points comprises: decoding each encoded block in the first set of encoded blocks to generate a first set of decoded blocks; oversampling each decoded block in the first set of decoded blocks to a source resolution associated with the source video sequence to generate a first set of oversampled blocks; and generating a different data point for each oversampled block in the first set of oversampled blocks.
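As an illustrative sketch only (not part of the claimed subject matter), the oversampling step of clause 14 can be expressed as follows; the nearest-neighbor scaler and the plain-list frame representation are assumptions made for brevity, and any scaling filter could be substituted:

```python
def upsample_nearest(frame, src_w, src_h):
    """Upsample one decoded frame (a 2-D list of pixel values) back to
    the source resolution with nearest-neighbor sampling, so that it can
    be compared against the source frame when a data point is generated."""
    h, w = len(frame), len(frame[0])
    return [
        # Map each source-resolution coordinate back to the nearest
        # coordinate in the lower-resolution decoded frame.
        [frame[y * h // src_h][x * w // src_w] for x in range(src_w)]
        for y in range(src_h)
    ]
```

For example, a 2x2 decoded frame oversampled to 4x4 simply repeats each pixel in a 2x2 neighborhood.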
[0103] [0103] 15. The computer-readable non-transitory medium of any of clauses 11, 12, 13 and 14, in which the step of performing one or more convex hull operations through the first set of data points to compute the first subset of data points comprises: determining a first region that includes the first set of data points; identifying a first boundary of the first region, where data points in the first set of data points do not reside on a first side of the first boundary; and including any data points that reside along the first boundary in the first subset of data points.
[0104] [0104] 16. The computer-readable non-transitory medium of any of clauses 11, 12, 13, 14 and 15, in which the first metric comprises distortion and the second metric comprises bit rate.
[0105] [0105] 17. The computer-readable non-transitory medium of any of clauses 11, 12, 13, 14, 15 and 16, further comprising the steps of: generating a second set of encoded blocks for the source video sequence; generating a second set of data points based on the second set of encoded blocks; performing one or more convex hull operations through the second set of data points to compute a second subset of data points that are optimized using at least two metrics; and computing a second slope value between a third data point included in the second subset of data points and a fourth data point included in the second subset of data points.
[0106] [0106] 18. The computer-readable non-transitory medium of any of clauses 11, 12, 13, 14, 15, 16 and 17, in which determining that the first encoded block associated with the first data point must be included in the final encoded version of the source video sequence comprises determining that the first slope has a magnitude greater than that of the second slope.
[0107] [0107] 19. The computer readable non-transitory medium of any of clauses 11, 12, 13, 14, 15, 16, 17 and 18, further comprising determining that a second coded block associated with the fourth data point must not be included in another encoded version of the source video sequence based on determining that the second slope value is less than one or more other slope values associated with one or more other subsets of data points.
[0108] [0108] 20. The computer readable non-transitory medium of any of clauses 11, 12, 13, 14, 15, 16, 17, 18 and 19, in which the first set of coded blocks is associated with a first sequence of scenes and a second set of coded blocks is associated with a second sequence of scenes.
[0109] [0109] 21. Some embodiments include a system, comprising: a memory storing a software application; and a processor that is coupled to the memory and, when executing the software application, is configured to: generate a first set of encoded blocks for a source video sequence, generate a first set of data points based on the first set of encoded blocks, perform one or more convex hull operations through the first set of data points to compute a first subset of data points that are optimized using at least two metrics, compute a first slope value between a first data point included in the first subset of data points and a second data point included in the first subset of data points, and determine, based on the first slope value, that a first encoded block associated with the first data point must be included in a final encoded version of the source video sequence.
[0110] [0110] 22. The system of clause 21, in which, when executing the software application, the processor is additionally configured to: generate the first set of coded blocks, generate the first set of data points, perform one or more convex hull operations, compute the first slope value, and determine that the first encoded block associated with the first data point must be included in the final encoded version of the source video sequence.
[0111] [0111] Any combination of any claim elements recited in any of the claims and/or any elements described in this application, in any fashion, falls within the contemplated scope of the present invention and protection.
[0112] [0112] The descriptions of the various embodiments have been presented for purposes of illustration and are not intended to be exhaustive or limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments.
[0113] [0113] Aspects of the present embodiments may be embodied as a system, method or computer program product. Accordingly, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to in this document as a "module" or "system". In addition, any hardware and/or software technique, process, function, component, mechanism, module or system described in the present disclosure may be implemented as a circuit or set of circuits. Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer-readable media having computer-readable program code embodied therein.
[0114] [0114] Any combination of one or more computer-readable media may be used. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable floppy disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus or device.
[0115] [0115] Aspects of the present disclosure have been described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine. The instructions, when executed by the processor of the computer or of another programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such processors may be, without limitation, general-purpose processors, special-purpose processors, application-specific processors or field-programmable gate arrays.
[0116] [0116] The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, segment or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
[0117] [0117] While the foregoing is directed to embodiments of the present disclosure, other and further embodiments of the disclosure may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Claims:
Claims (22)
[1]
1. Computer-implemented method, characterized by the fact that it comprises: generating a first set of encoded blocks for a source video sequence; generating a first set of data points based on the first set of encoded blocks; performing one or more convex hull operations through the first set of data points to compute a first subset of data points that are optimized through at least two metrics; computing a first slope value between a first data point included in the first subset of data points and a second data point included in the first subset of data points; and determining, based on the first slope value, that a first encoded block associated with the first data point must be included in a final encoded version of the source video sequence.
[2]
2. Computer-implemented method, according to claim 1, characterized by the fact that generating the first set of encoded blocks comprises: identifying within the source video sequence a first frame sequence that is associated with a first capture point; resampling the first frame sequence at a plurality of different resolutions to generate a resolution ladder of resampled versions of the first frame sequence; and encoding each resampled version of the first frame sequence with a different encoding parameter to generate the first set of encoded blocks.
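As an illustrative sketch only (not part of the claimed subject matter), the enumeration of per-shot encodes in claim 2 can be expressed as follows; the specific ladder heights and quantization parameter values shown in the usage below are assumed examples:

```python
def build_encode_jobs(source_height, ladder_heights, qps):
    """Pair every resampled resolution in the ladder with every
    quantization parameter, producing one encode job per combination."""
    jobs = []
    for height in ladder_heights:
        if height > source_height:
            continue  # never resample above the source resolution
        for qp in qps:
            jobs.append({"height": height, "qp": qp})
    return jobs
```

With a 1080-line source, a ladder of [2160, 1080, 720, 480] and parameters [22, 27, 32], this yields nine jobs, since the 2160-line rung is skipped.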
[3]
3. Computer-implemented method, according to claim 1, characterized by the fact that generating the first set of data points comprises: decoding each encoded block in the first set of encoded blocks to generate a first set of decoded blocks; oversampling each decoded block in the first set of decoded blocks to a source resolution associated with the source video sequence to generate a first set of oversampled blocks; and generating a different data point for each oversampled block in the first set of oversampled blocks.
[4]
4. Computer-implemented method, according to claim 3, characterized by the fact that a specific data point in the first set of data points is generated by: generating a specific objective quality metric for a specific oversampled block in the first set of oversampled blocks; converting the specific objective quality metric into a specific distortion metric; computing a bit rate for the specific oversampled block; and combining the specific distortion metric and the bit rate to generate the specific data point.
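As an illustrative sketch only (not part of the claimed subject matter), the conversion-and-combination step of claim 4 can be expressed as follows; the linear inversion of a 0-100 quality score against the top of the scale is an assumed conversion, not the disclosure's prescribed one:

```python
def data_point_from_quality(quality_score, encoded_bits, duration_s, max_quality=100.0):
    """Convert an objective quality score into a distortion value and
    pair it with the block's bit rate to form one (bitrate, distortion)
    data point."""
    distortion = max_quality - quality_score  # higher quality -> lower distortion
    bitrate = encoded_bits / duration_s       # bits per second for the block
    return (bitrate, distortion)
```

For instance, a 4-second block of 8,000,000 encoded bits with a quality score of 94 maps to the data point (2,000,000 bits/s, 6.0 distortion).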
[5]
5. Computer-implemented method, according to claim 1, characterized by the fact that performing one or more convex hull operations through the first set of data points to compute the first subset of data points comprises: determining a first region that includes the first set of data points; identifying a first boundary of the first region, where data points in the first set of data points do not reside on a first side of the first boundary; and discarding any data points that do not reside along the first boundary, where each data point that resides along the first boundary optimizes a first metric with respect to a second metric.
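As an illustrative sketch only (not part of the claimed subject matter), one standard way to realize the boundary computation of claim 5 is Andrew's monotone-chain lower hull over (bit rate, distortion) pairs; points above the boundary are discarded:

```python
def lower_convex_hull(points):
    """Keep only the (bitrate, distortion) points on the lower boundary
    of the point set; every discarded point lies on or above that
    boundary (monotone-chain lower hull)."""
    pts = sorted(set(points))  # sort by bitrate, then distortion
    hull = []
    for p in pts:
        # Pop the last kept point while it lies on or above the chord
        # from the point before it to the new point (non-left turn).
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            bx, by = p
            if (ax - ox) * (by - oy) - (ay - oy) * (bx - ox) <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull
```

For example, with points (1,10), (2,4), (3,3), (3,9) and (4,1), the point (3,3) is discarded because it sits above the chord from (2,4) to (4,1), and (3,9) is interior to the region.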
[6]
6. Method implemented by computer, according to claim 5, characterized by the fact that the first metric comprises distortion and the second metric comprises bit rate.
[7]
7. Computer-implemented method, according to claim 1, characterized by the fact that it additionally comprises: generating a second set of encoded blocks for the source video sequence; generating a second set of data points based on the second set of encoded blocks; performing one or more convex hull operations through the second set of data points to compute a second subset of data points that are optimized through at least two metrics; and computing a second slope value between a third data point included in the second subset of data points and a fourth data point included in the second subset of data points.
[8]
8. Computer-implemented method according to claim 7, characterized in that determining that the first coded block associated with the first data point must be included in the final coded version of the source video sequence comprises determining that the first slope has a magnitude greater than that of the second slope.
[9]
9. Computer-implemented method according to claim 7, characterized in that it further comprises determining that a second coded block associated with the fourth data point must be included in another coded version of the source video sequence based on in determining that the second slope value is greater than other slope values associated with other subsets of data points.
[10]
10. Computer-implemented method, according to claim 1, characterized by the fact that the first set of encoded blocks is associated with a first sequence of video frames captured continuously from a first capture point, and a second set of encoded blocks is associated with a second sequence of video frames captured continuously from a second capture point.
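As an illustrative sketch only (not part of the claimed subject matter), the slope-based selection recited in claims 7 through 9 can be realized as a greedy ascent across per-shot convex hulls; the dictionary-of-hulls representation is an assumption made for brevity:

```python
def greedy_shot_selection(shot_hulls, steps):
    """Starting each shot at its lowest-bitrate hull point, repeatedly
    advance the single shot whose next hull point yields the steepest
    distortion decrease per added bit, recording the per-shot selection
    after every step. `shot_hulls` maps each shot to its hull points as
    (bitrate, distortion) pairs sorted by increasing bitrate."""
    index = {shot: 0 for shot in shot_hulls}
    selections = [dict(index)]
    for _ in range(steps):
        best_shot, best_slope = None, 0.0
        for shot, hull in shot_hulls.items():
            i = index[shot]
            if i + 1 >= len(hull):
                continue  # this shot already uses its best encoding
            (b0, d0), (b1, d1) = hull[i], hull[i + 1]
            slope = (d0 - d1) / (b1 - b0)  # distortion saved per bit
            if best_shot is None or slope > best_slope:
                best_shot, best_slope = shot, slope
        if best_shot is None:
            break  # every shot is exhausted
        index[best_shot] += 1
        selections.append(dict(index))
    return selections
```

Each recorded selection corresponds to one combined encoded version of the sequence, from lowest total bit rate upward.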
[11]
11. Computer-readable non-transitory medium, characterized by the fact that it stores program instructions that, when executed by a processor, configure the processor to perform the steps of: generating a first set of encoded blocks for a source video sequence; generating a first set of data points based on the first set of encoded blocks; performing one or more convex hull operations through the first set of data points to compute a first subset of data points that are optimized through at least two metrics; computing a first slope value between a first data point included in the first subset of data points and a second data point included in the first subset of data points; and determining, based on the first slope value, that a first encoded block associated with the first data point must be included in a final encoded version of the source video sequence.
[12]
12. Computer-readable non-transitory medium, according to claim 11, characterized by the fact that the step of generating the first set of encoded blocks comprises: identifying within the source video sequence a first frame sequence that is associated with a first capture point; resampling the first frame sequence at a plurality of different resolutions to generate a resolution ladder of resampled versions of the first frame sequence; and encoding each resampled version of the first frame sequence with a different encoding parameter to generate the first set of encoded blocks.
[13]
13. Computer-readable non-transitory medium, according to claim 11, characterized by the fact that the step of generating the first set of encoded blocks comprises: generating a plurality of values for an encoding parameter based on a plurality of possible values and on a maximum number of encoded blocks, where the total number of values included in the plurality of values is less than the total number of possible values included in the plurality of possible values; and encoding a plurality of resampled versions of a first frame sequence based on the plurality of values for the encoding parameter to generate the first set of encoded blocks.
[14]
14. Computer-readable non-transitory medium, according to claim 11, characterized by the fact that the step of generating the first set of data points comprises: decoding each encoded block in the first set of encoded blocks to generate a first set of decoded blocks; oversampling each decoded block in the first set of decoded blocks to a source resolution associated with the source video sequence to generate a first set of oversampled blocks; and generating a different data point for each oversampled block in the first set of oversampled blocks.
[15]
15. Computer-readable non-transitory medium, according to claim 11, characterized by the fact that the step of performing one or more convex hull operations through the first set of data points to compute the first subset of data points comprises: determining a first region that includes the first set of data points; identifying a first boundary of the first region, where data points in the first set of data points do not reside on a first side of the first boundary; and including any data points that reside along the first boundary in the first subset of data points.
[16]
16. Computer-readable non-transitory medium, according to claim 15, characterized by the fact that the first metric comprises distortion and the second metric comprises bit rate.
[17]
17. Computer-readable non-transitory medium, according to claim 11, characterized by the fact that it additionally comprises the steps of: generating a second set of encoded blocks for the source video sequence; generating a second set of data points based on the second set of encoded blocks; performing one or more convex hull operations through the second set of data points to compute a second subset of data points that are optimized through at least two metrics; and computing a second slope value between a third data point included in the second subset of data points and a fourth data point included in the second subset of data points.
[18]
18. Computer readable non-transitory medium according to claim 17, characterized in that determining that the first coded block associated with the first data point must be included in the final coded version of the source video sequence comprises determining that the first slope has a magnitude greater than that of the second slope.
[19]
19. Computer readable non-transitory medium according to claim 17, characterized in that it further comprises determining that a second coded block associated with the fourth data point should not be included in another coded version of the video sequence of origin based on determining that the second slope value is less than one or more other slope values associated with one or more other subsets of data points.
[20]
20. Computer-readable non-transitory medium, according to claim 17, characterized by the fact that the first set of encoded blocks is associated with a first sequence of scenes and a second set of encoded blocks is associated with a second sequence of scenes.
[21]
21. System, characterized by the fact that it comprises: a memory storing a software application; and a processor that is coupled to the memory and, when executing the software application, is configured to: generate a first set of encoded blocks for a source video sequence, generate a first set of data points based on the first set of encoded blocks, perform one or more convex hull operations through the first set of data points to compute a first subset of data points that are optimized through at least two metrics, compute a first slope value between a first data point included in the first subset of data points and a second data point included in the first subset of data points, and determine, based on the first slope value, that a first encoded block associated with the first data point must be included in a final encoded version of the source video sequence.
[22]
22. System, according to claim 21, characterized by the fact that, when executing the software application, the processor is additionally configured to: generate the first set of coded blocks; generate the first set of data points; perform one or more convex hull operations; compute the first slope value; and determining that the first encoded block associated with the first data point must be included in the final encoded version of the source video sequence.
[Text recovered from the drawing sheets; much of the figure content did not survive extraction.]
FIGURE 1B: the encoding mechanism (160) comprises a host encoding mechanism and cloud encoding mechanisms (148(0)-148(N)) that transform the video sequence (170) into the encoded video (180).
FIGURE 2: a scene analyzer (200) divides the source video sequence into scene sequences.
FIGURE 5A: plot of quality metric (504) versus bit rate (510).
FIGURE 5B: plot of distortion (570) versus bit rate (560).
FIGURE 6: diagram of pipeline components, including a converter, an analyzer and convex hull data (remaining text illegible).
FIGURE 9: plot of distortion (920) versus bit rate (910).
FIGURE 10: flowchart — receive the original video sequence (1002); process the original video sequence to remove supplementary pixels (1004); divide the original video sequence into scene sequences (1006); for each scene sequence, resample the scene sequence M times to generate a resolution ladder of resampled sequences (1008); for each resampled sequence in the resolution ladder, process the resampled sequence via the processing pipeline to generate data points (1010); collect all data points for all resampled sequences in the resolution ladder to generate a data set (1012); generate convex hull points for the data set (1014); collect all convex hull points across all resolution ladders to generate a grid (1016); iteratively ascend the grid to generate a collection of video sequences and corresponding sequence data points (1018); select a video sequence based on a sequence data point for downstream consumption (1020).
FIGURE 11: flowchart — encode the resampled sequence with a selected quantization parameter (1102); decode the encoded sequence (1104); oversample the decoded sequence to the source resolution (1106); generate one or more quality metrics based on the oversampled sequence (1108); generate a data point that includes the resolution of the resampled sequence, the quantization parameter and the quality metric (1110).
FIGURE 12: flowchart — generate a grid based on the convex hull points for all scene sequences (1202); determine the sequence of convex hull points having the lowest bit rate (1204); designate the determined sequence as the current sequence (1206); generate an encoded video sequence based on the current sequence (1208); generate a sequence RD point based on the encoded video sequence (1210); compute the slope magnitude between each convex hull point in the current sequence and its neighboring convex hull point above (1212); identify the convex hull point and neighboring point above with the largest relative slope (1214); generate a new sequence of convex hull points that replaces that convex hull point with the neighboring convex hull point above (1216); designate the new sequence as the current sequence (1218).
[No further legible text was recovered from the remaining drawing sheets.]
Similar technologies:
Publication number | Publication date | Patent title
BR112020000998A2|2020-07-14|encoding techniques to optimize distortion and bit rate
CN110313183B|2021-11-12|Iterative techniques for encoding video content
US10742708B2|2020-08-11|Iterative techniques for generating multiple encoded versions of a media title
US20190028529A1|2019-01-24|Encoding techniques for optimizing distortion and bitrate
US11153585B2|2021-10-19|Optimizing encoding operations when generating encoded versions of a media title
US10841356B2|2020-11-17|Techniques for encoding a media title while constraining bitrate variations
US11196791B2|2021-12-07|Techniques for encoding a media title while constraining quality variations
US11234034B2|2022-01-25|Techniques for encoding a media title via multiple encoders
Patent family:
Publication number | Publication date
JP2020530955A|2020-10-29|
AU2018303643A1|2020-02-06|
US10666992B2|2020-05-26|
AU2018303643B2|2021-08-19|
WO2019018311A1|2019-01-24|
CA3069875A1|2019-01-24|
KR102304143B1|2021-09-23|
JP6953613B2|2021-10-27|
KR20200024325A|2020-03-06|
US20190028745A1|2019-01-24|
US20200288187A1|2020-09-10|
AU2021269405A1|2021-12-16|
CN111066327A|2020-04-24|
EP3656129A1|2020-05-27|
SG11202000395WA|2020-02-27|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title

US5612742A|1994-10-19|1997-03-18|Imedia Corporation|Method and apparatus for encoding and formatting data representing a video program to provide multiple overlapping presentations of the video program|
US6625322B1|1999-06-08|2003-09-23|Matsushita Electric Industrial Co., Ltd.|Image coding apparatus|
JP2004511976A|2000-10-10|2004-04-15|コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ|Video bit rate control method and apparatus for digital video recording|
US7103669B2|2001-02-16|2006-09-05|Hewlett-Packard Development Company, L.P.|Video communication method and system employing multiple state encoding and path diversity|
US7400774B2|2002-09-06|2008-07-15|The Regents Of The University Of California|Encoding and decoding of digital data using cues derivable at a decoder|
CN1778117A|2003-04-18|2006-05-24|皇家飞利浦电子股份有限公司|System and method for rate-distortion optimized data partitioning for video coding using parametric rate-distortion model|
WO2005029868A1|2003-09-23|2005-03-31|Koninklijke Philips Electronics, N.V.|Rate-distortion video data partitioning using convex hull search|
US7394410B1|2004-02-13|2008-07-01|Samplify Systems, Inc.|Enhanced data converters using compression and decompression|
JP4037839B2|2004-03-11|2008-01-23|株式会社東芝|Image coding method and apparatus|
DE102005042134B4|2005-09-05|2007-08-02|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Apparatus, method and computer program for coding parameter determination for a hybrid coding scheme|
WO2007075196A1|2005-09-07|2007-07-05|Vidyo, Inc.|System and method for a high reliability base layer trunk|
US7876819B2|2005-09-22|2011-01-25|Qualcomm Incorporated|Two pass rate control techniques for video coding using rate-distortion characteristics|
US20080043832A1|2006-08-16|2008-02-21|Microsoft Corporation|Techniques for variable resolution encoding and decoding of digital video|
JP5369893B2|2008-05-30|2013-12-18|株式会社Jvcケンウッド|Video encoding device, video encoding method, video encoding program, video decoding device, video decoding method, video decoding program, video re-encoding device, video re-encoding method, video re-encoding Encoding program|
US8396114B2|2009-01-29|2013-03-12|Microsoft Corporation|Multiple bit rate video encoding using variable bit rate and dynamic resolution for adaptive video streaming|
US20110052087A1|2009-08-27|2011-03-03|Debargha Mukherjee|Method and system for coding images|
US8171154B2|2009-09-29|2012-05-01|Net Power And Light, Inc.|Method and system for low-latency transfer protocol|
US9584700B2|2009-10-26|2017-02-28|Hewlett-Packard Development Company, L.P.|Color separation table optimized for a printing process according to a print attribute by selecting particular Neugebauer primaries and Neugebauer primary area coverages|
FR2963189B1|2010-07-20|2013-06-21|Freebox|METHOD FOR ADAPTIVE ENCODING OF A DIGITAL VIDEO STREAM, IN PARTICULAR FOR XDSL LINE BROADCAST.|
US8837601B2|2010-12-10|2014-09-16|Netflix, Inc.|Parallel video encoding based on complexity analysis|
JP6134650B2|2011-01-28|2017-05-24|アイ アイオー,リミテッド・ライアビリティ・カンパニーEye Io,Llc|Applicable bit rate control based on scene|
US9451284B2|2011-10-10|2016-09-20|Qualcomm Incorporated|Efficient signaling of reference picture sets|
JP5964446B2|2011-11-09|2016-08-03|フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ|Inter-layer prediction between different dynamic sample value ranges|
WO2013083199A1|2011-12-09|2013-06-13|Telefonaktiebolaget L M Ericsson |Method and apparatus for detecting quality defects in a video bitstream|
US9571827B2|2012-06-08|2017-02-14|Apple Inc.|Techniques for adaptive video streaming|
US9125073B2|2012-08-03|2015-09-01|Intel Corporation|Quality-aware adaptive streaming over hypertext transfer protocol using quality attributes in manifest file|
EP2939420B1|2013-01-15|2018-03-14|Huawei Technologies Co., Ltd.|Using quality information for adaptive streaming of media content|
US10019985B2|2013-11-04|2018-07-10|Google Llc|Asynchronous optimization for sequence training of neural networks|
JP6271756B2|2013-12-02|2018-01-31|ドルビー・インターナショナル・アーベー|Method of bit rate signaling and bit stream format enabling the method|
US9767101B2|2014-06-20|2017-09-19|Google Inc.|Media store with a canonical layer for content|
US20160073106A1|2014-09-08|2016-03-10|Apple Inc.|Techniques for adaptive video streaming|
EP3041233A1|2014-12-31|2016-07-06|Thomson Licensing|High frame rate-low frame rate transmission technique|
US9749646B2|2015-01-16|2017-08-29|Microsoft Technology Licensing, Llc|Encoding/decoding of high chroma resolution details|
CN104767999B|2015-04-22|2017-11-17|福州大学|A kind of HEVC Rate Controls model parameter more new algorithm based on distortion measurement|
US9734409B2|2015-06-24|2017-08-15|Netflix, Inc.|Determining native resolutions of video sequences|
US10602153B2|2015-09-11|2020-03-24|Facebook, Inc.|Ultra-high video compression|
US10255667B2|2015-12-23|2019-04-09|Vmware, Inc.|Quantitative visual perception quality measurement for virtual desktops|
US20180063549A1|2016-08-24|2018-03-01|Ati Technologies Ulc|System and method for dynamically changing resolution based on content|
AU2017368324A1|2016-12-01|2019-05-30|Brightcove, Inc.|Optimization of encoding profiles for media streaming|
US10742708B2|2017-02-23|2020-08-11|Netflix, Inc.|Iterative techniques for generating multiple encoded versions of a media title|
US10897618B2|2017-02-23|2021-01-19|Netflix, Inc.|Techniques for positioning key frames within encoded video sequences|
US11153585B2|2017-02-23|2021-10-19|Netflix, Inc.|Optimizing encoding operations when generating encoded versions of a media title|
US20190028529A1|2017-07-18|2019-01-24|Netflix, Inc.|Encoding techniques for optimizing distortion and bitrate|
US10742708B2|2017-02-23|2020-08-11|Netflix, Inc.|Iterative techniques for generating multiple encoded versions of a media title|
US10897618B2|2017-02-23|2021-01-19|Netflix, Inc.|Techniques for positioning key frames within encoded video sequences|
US11153585B2|2017-02-23|2021-10-19|Netflix, Inc.|Optimizing encoding operations when generating encoded versions of a media title|
US11166034B2|2017-02-23|2021-11-02|Netflix, Inc.|Comparing video encoders/decoders using shot-based encoding and a perceptual visual quality metric|
FR3078798B1|2018-03-12|2021-04-30|Ateme|METHOD OF SELECTING ENCODING PROFILES OF A MULTIMEDIA CONTENT FOR ON-DEMAND BROADCASTING|
US10616590B1|2018-05-16|2020-04-07|Amazon Technologies, Inc.|Optimizing streaming video encoding profiles|
US11128869B1|2018-10-22|2021-09-21|Bitmovin, Inc.|Video encoding based on customized bitrate table|
US10965945B2|2019-03-29|2021-03-30|Bitmovin, Inc.|Optimized multipass encoding|
US10897654B1|2019-09-30|2021-01-19|Amazon Technologies, Inc.|Content delivery of live streams with event-adaptive encoding|
US11115697B1|2019-12-06|2021-09-07|Amazon Technologies, Inc.|Resolution-based manifest generator for adaptive bitrate video streaming|
US10958947B1|2020-03-12|2021-03-23|Amazon Technologies, Inc.|Content delivery of live streams with playback-conditions-adaptive encoding|
US11190826B1|2020-06-25|2021-11-30|Disney Enterprises, Inc.|Segment quality-guided adaptive stream creation|
CN113810629B|2021-11-19|2022-02-08|南京好先生智慧科技有限公司|Video frame processing method and device for multimedia signal of fusion platform|
Legal status:
2021-11-03| B350| Update of information on the portal [chapter 15.35 patent gazette]|
Priority:
Application number | Filing date | Patent title
US201762534170P| true| 2017-07-18|2017-07-18|
US62/534,170|2017-07-18|
US16/034,303|US10666992B2|2017-07-18|2018-07-12|Encoding techniques for optimizing distortion and bitrate|
US16/034,303|2018-07-12|
PCT/US2018/042338|WO2019018311A1|2017-07-18|2018-07-16|Encoding techniques for optimizing distortion and bitrate|